Using broad phonetic classes to guide search in automatic speech recognition

Authors

Stefan Ziegler

Bogdan Ludusan

Guillaume Gravier

Abstract

This work presents a novel framework that uses broad phonetic classes to guide the Viterbi decoding process of a hidden Markov model-based speech recognition system. First, decision trees operating on frame- and segment-based attributes detect broad phonetic classes in the speech signal. The detected classes are then used to reinforce paths in the search process, either at every frame or at phonetically significant landmarks. Results obtained on French broadcast news data show a relative improvement in word error rate of about 2% with respect to the baseline.
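The path-reinforcement idea can be illustrated with a minimal sketch. The abstract does not say how the reinforcement is computed, so the additive log-domain bonus below is an assumption, as are the function name `viterbi_with_class_bonus`, the `bonus` parameter, and the toy class labels. The list `detected` holds one broad-class label per frame, with `None` where no detection is available; filling it only at selected frames would correspond to the landmark variant.

```python
import math


def viterbi_with_class_bonus(log_init, log_trans, log_emit,
                             state_class, detected, bonus=0.0):
    """Log-domain Viterbi over a small HMM (hypothetical sketch).

    log_init[s]     : log initial probability of state s
    log_trans[p][s] : log transition probability from state p to state s
    log_emit[t][s]  : log emission score of state s at frame t
    state_class[s]  : broad phonetic class assigned to state s
    detected[t]     : broad class detected at frame t, or None
    bonus           : assumed additive reward for states whose class
                      matches the detected class (not from the paper)
    Returns (best_state_path, best_path_score).
    """
    n_states = len(log_init)
    n_frames = len(log_emit)

    def boost(t, s):
        # Reinforce state s at frame t only when a class was detected
        # at that frame and it matches the state's broad class.
        if detected[t] is not None and detected[t] == state_class[s]:
            return bonus
        return 0.0

    score = [log_init[s] + log_emit[0][s] + boost(0, s)
             for s in range(n_states)]
    backptr = []
    for t in range(1, n_frames):
        new_score, ptr = [], []
        for s in range(n_states):
            # Best predecessor of state s at frame t.
            prev = max(range(n_states),
                       key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[prev] + log_trans[prev][s]
                             + log_emit[t][s] + boost(t, s))
            ptr.append(prev)
        score = new_score
        backptr.append(ptr)

    # Backtrack from the best final state.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]


# Toy example: two states, one per broad class, three frames.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.5), math.log(0.5)],
             [math.log(0.5), math.log(0.5)]]
log_emit = [[math.log(0.9), math.log(0.1)],
            [math.log(0.55), math.log(0.45)],
            [math.log(0.9), math.log(0.1)]]
state_class = ["vowel", "fricative"]
detected = [None, "fricative", None]  # detection only at the middle frame

baseline_path, _ = viterbi_with_class_bonus(
    log_init, log_trans, log_emit, state_class, detected, bonus=0.0)
guided_path, _ = viterbi_with_class_bonus(
    log_init, log_trans, log_emit, state_class, detected, bonus=1.0)
```

In this toy setup the acoustic scores at the middle frame are nearly ambiguous, so the bonus is enough to move the best path through the state whose class matches the detection. In a real decoder the size of such a reward would need tuning against detector reliability.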


Similar resources

Towards Phonetically-Driven Hidden Markov Models: Can We Incorporate Phonetic Landmarks in HMM-Based ASR?

Automatic speech recognition mainly relies on hidden Markov models (HMM), which make little use of phonetic knowledge. As an alternative, landmark-based recognizers rely mainly on precise phonetic knowledge and exploit distinctive features. We propose a theoretical framework to combine both approaches by introducing phonetic knowledge in a non-stationary HMM decoder. To demonstrate the potential...


Bio-inspired Broad-class Phonetic Labelling

Recent studies have shown that the correct labeling of phonetic classes may help current Automatic Speech Recognition (ASR) when combined with classical parsing automata based on Hidden Markov Models (HMM). This paper describes a method for Phonetic Class Labeling (PCL) based on bio-inspired speech processing. The methodology is based on the automatic detection of formants and...


Exploitation of Morphological Structures in Large Vocabulary Arabic Speech Recognition

This paper presents a new approach for large vocabulary Arabic speech recognition based on exploiting the morphological structures of the Arabic language. In this model, word discrimination is achieved by a hybrid analysis scheme, where vowels are described in detail while consonants are classified according to broad phonetic classes. Different phonetic classification strategies are used to d...


Segmentation of Continuous Speech Using Acoustic-phonetic Parameters and Statistical Learning

In this paper, we present a methodology for combining acoustic-phonetic knowledge with statistical learning for automatic segmentation and classification of continuous speech. At present we focus on the recognition of the broad classes vowel, stop, fricative, sonorant consonant, and silence. Judicious use is made of 13 knowledge-based acoustic parameters (APs) and support vector machines (SVMs). It ...


A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making man-machine interaction more natural. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...




Publication date: 2012
